In-RDBMS Hardware Acceleration of Advanced Analytics

Authors

  • Divya Mahajan
  • Joon Kyung Kim
  • Jacob Sacks
  • Adel Ardalan
  • Arun Kumar
  • Hadi Esmaeilzadeh
Abstract

The data revolution is fueled by advances in several areas, including databases, high-performance computer architecture, and machine learning. Although such a combination is timely, there is a void of solutions that bring these disjoint directions together. This paper sets out to be the initial step towards such a union. The aim is to devise a solution for the in-Database Acceleration of Advanced Analytics (DAnA). DAnA empowers database users to leap beyond traditional data summarization techniques and seamlessly utilize hardware-accelerated machine learning. Deploying specialized hardware, such as FPGAs, for in-database analytics currently requires hand-designing the hardware and manually routing the data. Instead, DAnA automatically maps a high-level specification of in-database analytics queries to the FPGA accelerator. The accelerator implementation is generated from a User Defined Function (UDF), expressed as part of a SQL query in a Python-embedded Domain-Specific Language (DSL). To realize efficient in-database integration, DAnA-generated accelerators contain a novel hardware structure, Striders, that directly interfaces with the buffer pool of the database. DAnA obtains the schema and page layout information from the database catalog to configure the Striders. In turn, Striders extract, cleanse, and process the training data tuples, which are consumed by a multi-threaded FPGA engine that executes the analytics algorithm. We integrated DAnA with PostgreSQL to generate hardware accelerators for a range of real-world and synthetic datasets running diverse ML algorithms. Results show that DAnA-enhanced PostgreSQL provides, on average, 11.3× end-to-end speedup for real datasets, with the maximum at 58.2×. Moreover, DAnA-enhanced PostgreSQL is 5.4× faster, on average, than the multi-threaded Apache MADLib running on Greenplum. DAnA provides these benefits while hiding the complexity of hardware design from data scientists and allowing them to express the algorithm in ≈30-60 lines of Python code.
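
The data path described in the abstract (tuples pulled straight out of buffer-pool pages using schema information, then streamed into a training loop) can be illustrated with a small software-only sketch. This is not DAnA's DSL, the Striders hardware, or PostgreSQL's real page format: the page layout (a 2-byte tuple count followed by fixed-width float32 tuples) and the names `pack_page`, `extract_tuples`, and `sgd_linear_regression` are assumptions made up for illustration, and linear regression with stochastic gradient descent stands in for the "analytics algorithm".

```python
import struct

# Hypothetical, simplified page layout (NOT PostgreSQL's real format):
# a 2-byte little-endian tuple count, then fixed-width tuples of float32 values.

def pack_page(rows):
    """Serialize rows (equal-length tuples of floats) into one toy page."""
    n_cols = len(rows[0])
    body = b"".join(struct.pack(f"<{n_cols}f", *row) for row in rows)
    return struct.pack("<H", len(rows)) + body

def extract_tuples(page, n_cols):
    """Walk a toy page and yield its tuples, using only the column count
    (standing in for schema information read from a catalog)."""
    (count,) = struct.unpack_from("<H", page, 0)
    width = 4 * n_cols  # each float32 column occupies 4 bytes
    for i in range(count):
        yield struct.unpack_from(f"<{n_cols}f", page, 2 + i * width)

def sgd_linear_regression(pages, n_features, lr=0.01, epochs=200):
    """Fit y ~ w . x by stochastic gradient descent over tuples streamed
    from pages; the last column of each tuple is treated as the label y."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for page in pages:
            for *x, y in extract_tuples(page, n_features + 1):
                err = sum(wi * xi for wi, xi in zip(w, x)) - y
                w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

# Three training tuples (x, y) with y = 2x, spread over two toy pages.
pages = [pack_page([(1.0, 2.0), (2.0, 4.0)]), pack_page([(3.0, 6.0)])]
weights = sgd_linear_regression(pages, n_features=1)
```

In this sketch the extraction step and the learning step are separate functions connected only by a stream of tuples, which loosely mirrors the split the abstract draws between Striders and the execution engine; everything else about the real system is outside what the sketch shows.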

Similar resources

Application of Big Data Analytics in Power Distribution Network

The smart grid enhances optimization in the generation, distribution, and consumption of electricity by integrating information and communication technologies into the grid. Today, utilities are moving towards smart grid applications, the most common one being deployment of smart meters in advanced metering infrastructure, and the first technical challenge they face is the huge volume of data generated ...

A Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection

Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change. The popularity of big data analytics platforms, which are often available as open-source, has not remained unnoticed by big companies. Google uses MapReduce for PageRank and inverted indexes....

Advanced Electron Linacs

The research into advanced acceleration concepts for electron linear accelerators being pursued at SLAC is reviewed. This research includes experiments in laser acceleration, plasma wakefield acceleration, and mm-wavelength RF driven accelerators.

Comparison of Data Warehousing DBMS Platforms

Although relational databases (RDBMS) are the most common choice for data warehouse implementations, their record-based structure is far from ideal. As data volumes grow and users demand more sophisticated analytical capabilities, the deficiencies of the RDBMS to data storage become more conspicuous. RDBMS data warehouse systems are difficult to design; extremely inefficient in their use of dis...

On Current Strategies for Hardware Acceleration of Digital Image Restoration Filters

Two advanced design methodologies for hardware acceleration of a standard digital image restoration algorithm are explored and compared. The first one is the custom-designed hardware approach, leading to an application-specific integrated circuit (ASIC) implementation. The second one consists of the configurable processor approach, yielding a mixed hardware/software implementation running on a ...

Journal title:
  • CoRR

Volume abs/1801.06027  Issue 

Pages  -

Publication date 2018